Some recent works revealed that deep neural networks (DNNs) are vulnerable to so-called adversarial attacks, where input examples are intentionally perturbed to fool DNNs. In this work, we revisit the DNN training process that includes adversarial examples in the training dataset so as to improve the DNN's resilience to adversarial attacks, namely, adversarial training. Our experiments show that different adversarial strengths, i.e., perturbation levels of adversarial examples, have different working zones for resisting the attack. Based on this observation, we propose a multi-strength adversarial training method (MAT) that combines adversarial training examples with different adversarial strengths to defend against adversarial attacks. Two training structures, mixed MAT and parallel MAT, are developed to facilitate the tradeoffs between training time and memory occupation. Our results show that MAT can substantially minimize the accuracy degradation of deep learning systems under adversarial attacks on MNIST, CIFAR-10, CIFAR-100, and SVHN.
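To illustrate the idea of mixing adversarial examples of several strengths into training, the sketch below shows one possible training step in the spirit of mixed MAT. It is a minimal PyTorch sketch, not the authors' implementation: the use of FGSM as the attack, the epsilon values in `eps_list`, the equal-weight averaging of losses, and the assumption that inputs lie in [0, 1] are all illustrative choices.

```python
# Minimal sketch of a "mixed" multi-strength adversarial training step.
# Assumptions (not from the paper): FGSM perturbations, inputs in [0, 1],
# illustrative epsilon values, and equal weighting of clean/adversarial losses.
import torch
import torch.nn.functional as F


def fgsm_examples(model, x, y, eps):
    """Generate FGSM adversarial examples at perturbation strength eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the sign of the input gradient and clamp back to the valid range.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()


def mixed_mat_step(model, optimizer, x, y, eps_list=(0.02, 0.05, 0.1)):
    """One training step mixing a clean batch with adversarial batches
    generated at several perturbation strengths."""
    model.train()
    batches = [x] + [fgsm_examples(model, x, y, eps) for eps in eps_list]
    optimizer.zero_grad()  # clear gradients accumulated while crafting attacks
    loss = sum(F.cross_entropy(model(b), y) for b in batches) / len(batches)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A parallel-MAT-style variant could instead keep separate model copies (or branches), one per strength, trading extra memory for the ability to process the differently perturbed batches concurrently.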